A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition
Authors
Abstract
We propose a new cocktail-party recognition technique that couples a CASA-labelling method based on the TDOA (Time Delay Of Arrival) with multistream recognition. This is an alternative to the classical "segregate and recognise" architecture. First, we recorded a stereo database, ST-NB95, from the mono Numbers95 database. It is composed of binary mixtures of sentences at 0 dB, with the sources placed left and right. A mapping function assigns to each subband time frame the probability of the labels "left" and "right". The mapping depends on the relative level and is established a priori, using a reference database of isolated words recorded under the same conditions. We adapt the recognition paradigm to this particular situation. The WER of the model on binary mixtures is about 50%, a large improvement over the 73% WER of fullband PLP. We conclude that the model is able to recognise the dominant words of a binary mixture.
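To make the localisation cue concrete, here is a minimal Python sketch of how a TDOA could be estimated per time frame from a stereo signal and turned into a hard "left"/"right" label. It is an illustration under simplifying assumptions, not the method of the paper: it works on the fullband signal rather than on subbands, it uses a plain cross-correlation peak, and it replaces the probabilistic mapping function with a hard decision on the sign of the delay. The function names `estimate_tdoa` and `label_frames` and the parameters `frame_len` and `max_lag` are hypothetical.

```python
import numpy as np


def estimate_tdoa(left, right, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation.

    A positive lag means the right channel is a delayed copy of the left
    one, i.e. the sound reached the left microphone first.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        if lag >= 0:
            # Compare left[k] with right[k + lag].
            a, b = left[:len(left) - lag], right[lag:]
        else:
            # Compare left[k - lag] with right[k].
            a, b = left[-lag:], right[:len(right) + lag]
        scores.append(np.dot(a, b))
    return int(lags[int(np.argmax(scores))])


def label_frames(left, right, frame_len, max_lag):
    """Label each non-overlapping frame by the sign of its estimated TDOA.

    Assumption for this sketch only: a hard decision stands in for the
    a priori mapping to label probabilities described in the abstract.
    """
    labels = []
    for start in range(0, len(left) - frame_len + 1, frame_len):
        lag = estimate_tdoa(left[start:start + frame_len],
                            right[start:start + frame_len],
                            max_lag)
        if lag > 0:
            labels.append("left")
        elif lag < 0:
            labels.append("right")
        else:
            labels.append("centre")
    return labels
```

Calling `label_frames(left, right, frame_len=256, max_lag=8)` on two equal-length channel arrays returns one label per frame. A system closer to the abstract would apply the same idea within each subband and output probabilities for the two labels instead of a hard choice.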
Similar resources
A CASA Front-end Using the Localisation Cue for Segregation and Then Cocktail-party Speech Recognition
We propose and test a cocktail-party recognition technique based on segregation applied before recognition. This CASA front-end uses the TDOA (Time Delay Of Arrival) evaluated within subbands in order to determine the Relative Level (RL) of two competing speech sources. To perform the evaluation of the model, we have recorded a stereo database ST-NB95 from the mono Numbers95. This is composed o...
Evaluation of CASA and BSS models for cocktail-party speech segregation
For speech segregation, a blind separation model (BSS) is tested together with a CASA model which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the frequency doma...
Evaluation of CASA and BSS models for subband cocktail-party speech separation
For speech segregation, a recurrent blind separation model (BSS) is tested together with a CASA model, which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the fre...
Comparative evaluation of CASA and BSS models for subband cocktail-party speech separation
For speech segregation, a blind separation model (BSS) is tested together with a CASA model which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the frequency doma...
Comparative evaluation of CA for subband cocktail-party
For speech segregation, a recurrent blind separation model (BSS) is tested together with a Computational Auditory Scene Analysis (CASA) model, which is based on the localisation cue and the evaluation of the Time Delay Of Arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For...
Journal title:
Volume, issue:
Pages: -
Publication date: 1999